Coloring random graphs

We study the graph coloring problem over random graphs of finite average connectivity $c$. Given
a number $q$ of available colors, we find that graphs with low connectivity almost always admit a
proper coloring, whereas graphs with high connectivity are uncolorable. Depending on $q$, we find
the precise value of the critical average connectivity $c_q$. Moreover, we show that below $c_q$
there exists a clustering phase $c \in [c_d, c_q]$ in which ground states spontaneously divide into
an exponential number of clusters and where the proliferation of metastable states is responsible
for the onset of complexity in local search algorithms.

PACS numbers: 89.20.Ff, 75.10.Nr, 05.70.Fh, 02.70.-c

The Graph Coloring problem (COL) is one of the most basic and famous problems in combinatorics [1]
and in statistical physics [2]. Given a graph, or a lattice, and given a number $q$ of available
colors, the problem consists in finding a coloring of the vertices such that no two neighboring
vertices have the same color. The minimally needed number of colors is the chromatic number of the
graph.
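As a small illustration of these definitions (not part of the original text), the following Python sketch checks whether a coloring is proper and finds the chromatic number of a tiny graph by exhaustive search. The function names and the example graphs are hypothetical, and the brute-force search is only feasible for a handful of vertices.

```python
from itertools import product

def conflicting_edges(edges, coloring):
    """Return the edges whose two end points carry the same color."""
    return [(i, j) for i, j in edges if coloring[i] == coloring[j]]

def is_proper(edges, coloring):
    """A coloring is proper when no edge is monochromatic."""
    return not conflicting_edges(edges, coloring)

def chromatic_number(n, edges):
    """Smallest q admitting a proper coloring; brute force, tiny graphs only."""
    for q in range(1, n + 1):
        for colors in product(range(q), repeat=n):
            if is_proper(edges, colors):
                return q
    return n

# Hypothetical examples: a triangle and a 4-cycle.
triangle = [(0, 1), (1, 2), (0, 2)]
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
```

The exhaustive loop visits up to $q^N$ colorings, which is exactly the kind of cost that makes the problem hard for large graphs.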

For planar graphs there exists a famous theorem [?] showing that four colors are sufficient, and
that a coloring can be found by an efficient algorithm. In contrast, for general graphs the problem
is computationally hard to solve: already in 1972 it was shown that Graph Coloring is NP-complete
[?], which means, roughly speaking, that the time required for determining the existence of a
proper coloring is expected to grow exponentially with the graph size in the worst case.

In modern computer science, graph coloring is taken as one of the most widely used benchmarks for
the evaluation of algorithm performance [?]. The interest in coloring stems from the fact that many
real-world combinatorial optimization problems have component sub-problems which can be easily
represented as coloring problems. For instance, a classical application is the scheduling of
registers in the central processing unit of computers. All variables manipulated by the program are
characterized by ranges of times during which their values are left unchanged. Any two variables
that change during the same time interval cannot be stored in the same register. One may represent
the overall computation by constructing a graph where each variable is associated with a vertex and
edges are placed between any two vertices whose corresponding variables change during the same time
interval. A proper coloring of this graph with a minimal number of colors provides an optimal
scheduling for registers: two variables with the same color will not be connected by an edge and so
can be assigned to the same register (since they change in different time intervals).
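The register example can be made concrete with a short Python sketch. It assumes that each variable is described by a single half-open time interval (a simplification of the setting described above), builds the conflict graph, and colors it with a simple greedy rule. The variable names and intervals are hypothetical, and the greedy rule is a stand-in that does not guarantee a minimal number of colors in general.

```python
from itertools import combinations

def interference_graph(live_ranges):
    """Edges between variables whose half-open (start, end) intervals overlap."""
    edges = set()
    for (a, (s1, e1)), (b, (s2, e2)) in combinations(live_ranges.items(), 2):
        if s1 < e2 and s2 < e1:
            edges.add((a, b))
    return edges

def greedy_coloring(vertices, edges):
    """Give each vertex the smallest color not used by an already colored neighbor."""
    neighbors = {v: set() for v in vertices}
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)
    color = {}
    for v in vertices:
        used = {color[u] for u in neighbors[v] if u in color}
        color[v] = next(k for k in range(len(vertices)) if k not in used)
    return color

# Hypothetical intervals in arbitrary time units.
live = {"a": (0, 4), "b": (2, 6), "c": (5, 9), "d": (7, 10)}
edges = interference_graph(live)
registers = greedy_coloring(list(live), edges)
```

In this example the conflict graph is a path, so two registers suffice: variables with the same color never overlap in time.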

The $q$-coloring problem of random graphs represents a very active field of research in discrete
mathematics which constitutes the natural evolution of the percolation theory initiated by Erdős
and Rényi in the 1950s [?]. One point of contact between computer science and random graph theory
arises from the observation that, for large random graphs, there exists a critical average
connectivity beyond which the graphs become uncolorable with probability tending to one as the
graph size goes to infinity. This transition will be called the $q$-COL/UNCOL transition throughout
this paper. The precise value of the critical connectivity depends, of course, on the number $q$ of
allowed colors and on the ensemble of random graphs under consideration. Graphs generated close to
their critical connectivity are extraordinarily hard to color, and therefore the study of critical
instances is at the same time a well-posed mathematical question and an algorithmic challenge for
the understanding of the onset of computational complexity [?, ?]. The notion of computational
complexity refers to worst-case instances, and therefore results for a given ensemble of problems
might not be of direct relevance. However, on the more practical side, algorithms which are used to
solve real-world problems display a huge variability of running times, and a theory for their
typical-case behavior, on classes of non-trivial random instances, constitutes the natural
complement to the worst-case analysis. Similarly to what happens for other famous combinatorial
problems, e.g. the satisfiability problem of Boolean formulae, critical random instances of
$q$-coloring (or polynomial mappings to other NP-complete problems) are a popular test-bed for the
performance of search algorithms [?].

In physics, $q$-coloring has a direct interpretation as a spin-glass model. A proper coloring of a
graph is a zero-energy ground-state configuration of a Potts anti-ferromagnet with $q$-state
variables. For most lattices this system is frustrated and displays many equilibrium and
out-of-equilibrium features of glasses ('Potts glass').

Here we focus on the $q$-coloring problem (or Potts anti-ferromagnet) over random graphs of finite
average connectivity, given by the $\mathcal{G}_{N,p}$ ensemble (graphs composed of $N$ vertices
with edge probability $p$ for every pair of vertices). In the relevant limit of finite connectivity
we have to take $p = c/N$, which leads to random graphs with a Poissonian connectivity distribution
of mean $c$.
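A minimal Python sketch of this ensemble is given below: every pair of vertices is connected independently with probability $p = c/N$. The function names, the seed and the sizes are illustrative assumptions; the quadratic loop over pairs is chosen for clarity and is not efficient for very large $N$.

```python
import random

def random_graph(n, c, seed=None):
    """Sample a graph with n vertices and edge probability p = c/n per pair."""
    rng = random.Random(seed)
    p = c / n
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

def degrees(n, edges):
    """Degree of every vertex."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return deg

n, c = 2000, 4.0
edges = random_graph(n, c, seed=0)
mean_degree = sum(degrees(n, edges)) / n   # close to c for large n
```

The expected number of edges is $c(N-1)/2$, so the mean degree approaches $c$ as $N$ grows.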

Two types of questions can be asked. One type is algorithmic, i.e. finding an algorithm that
decides whether a given graph is colorable. The other type is more theoretical and amounts to
asking whether a typical problem instance is colorable or not, and what the typical structure of
the solution space is. Here we address the latter question using the so-called cavity method [?].

Let us start by reviewing some known results on the $q$-COL/UNCOL transition on random graphs. One
of the first important finite-connectivity results was obtained by Łuczak about one decade ago
[10]. He proved that the threshold asymptotically grows like $c_q \sim 2q \ln q$ for large numbers
of colors, a result which, up to a prefactor, coincides with the outcome of a replica calculation
on highly connected graphs [11] ($p = O(1)$ for large $N$). For a fixed number $q$ of colors, all
vertices with fewer than $q$ neighbors, i.e. of degree smaller than $q$, can be colored for sure.
The hardest-to-color structure is thus given by the maximal subgraph having minimal degree at least
$q$, the so-called $q$-core. Pittel, Spencer and Wormald showed that the emergence of a 2-core
coincides with random graph percolation at $c = 1$ [?], and is continuous. For $q \geq 3$, however,
the $q$-core arises discontinuously: for $q = 3$ they found, e.g., that the core emerges at
$c \simeq 3.35$ and immediately contains about 27% of all vertices. The existence of this core is,
however, not sufficient for uncolorability: the best lower bound for the 3-COL/UNCOL transition is
4.03 [?], and numerical results predict a threshold of about 4.7 [?]. The currently best rigorous
upper bound is 4.99 [15]. Most recently, a replica-symmetric analysis of the problem has been
performed [16]. The resulting threshold 5.1 exceeds, however, the rigorous bound, and one has to go
beyond replica symmetry. At the level of one-step replica-symmetry breaking (1RSB) we are able to
calculate a threshold value $c_3 \simeq 4.69$ which we believe to be exact. We also describe the
solution space structure, which undergoes a clustering transition at $c_d \simeq 4.42$.
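The $q$-core mentioned above can be obtained by repeatedly removing vertices of degree smaller than $q$. The following Python sketch implements this standard peeling procedure; the function name and the small test graph are hypothetical and serve only as an illustration.

```python
from collections import deque

def q_core(n, edges, q):
    """Vertex set of the maximal subgraph with minimum degree at least q."""
    adj = [set() for _ in range(n)]
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)
    alive = [True] * n
    queue = deque(v for v in range(n) if len(adj[v]) < q)
    while queue:
        v = queue.popleft()
        if not alive[v]:
            continue                      # already removed earlier
        alive[v] = False
        for u in adj[v]:
            adj[u].discard(v)             # removing v lowers the degree of u
            if alive[u] and len(adj[u]) < q:
                queue.append(u)
        adj[v].clear()
    return {v for v in range(n) if alive[v]}

# Hypothetical example: a complete graph on {0,1,2,3} with a dangling path 3-4-5.
k4_with_tail = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
```

Vertices removed in this way can always be colored last, since each of them has fewer than $q$ neighbors left at the moment of its removal.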

As stated above, the question whether a given graph is $q$-colorable is equivalent to the question
whether there are zero-energy ground states of the anti-ferromagnetic $q$-state Potts model defined
on the same graph. Denoting the set of all edges by $E$, the problem can thus be described by the
Hamiltonian

$$ H = \sum_{(i,j) \in E} \delta(\sigma_i, \sigma_j) \tag{1} $$

with $\delta(\cdot,\cdot)$ denoting the Kronecker symbol. The spins $\sigma_i$, $i = 1, \ldots, N$,
are allowed to take the $q$ values $\{1, \ldots, q\}$. This Hamiltonian counts the number of edges
being colored equally on both extremities; a proper coloring of the graph thus has energy zero. In
this letter we apply the cavity method in a variant recently developed for finite-connectivity
graphs directly at zero temperature [18, 19, 20]. This approach consists of a self-consistent
iterative scheme which is believed to be exact over locally tree-like graphs, like the ones we
consider here. It includes the possibility of dealing with the existence of many pure states. One
first has to evaluate the energy shift of the system due to the addition of a new spin $\sigma_0$.
Let us assume for a moment that the new spin is only connected to a single spin, say $\sigma_1$, in
the pre-existing graph. Before adding the new site 0, the ground-state energy of the system with
fixed $\sigma_1$ can be expressed as

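$$ E(\sigma_1) = A - \sum_{\tau=1}^{q} h^1_\tau \, \delta(\tau, \sigma_1) \tag{2} $$

where we have introduced the effective field $\vec h^1 = (h^1_1, \ldots, h^1_q)$. Note that a
$(q-1)$-dimensional field would be sufficient, since one of the $q$ fields above can be absorbed in
$A$. We, however, prefer to work with $q$ field components in order to keep the global color
symmetry evident. Connecting $\sigma_0$ to $\sigma_1$, and calculating the minimal energy of the
enlarged graph with fixed $\sigma_0$, this reads

$$ E(\sigma_0) = \min_{\sigma_1} \Big[ A - \sum_{\tau=1}^{q} h^1_\tau \, \delta(\tau, \sigma_1) + \delta(\sigma_0, \sigma_1) \Big]
   = A - \omega(\vec h^1) - \sum_{\tau=1}^{q} u_\tau(\vec h^1) \, \delta(\tau, \sigma_0) \tag{3} $$

with

$$ \omega(\vec h) = -\min(-h_1, \ldots, -h_q), \qquad
   u_\tau(\vec h) = \begin{cases} -1 & \text{if } -h_\tau < -h_1, \ldots, -h_{\tau-1}, -h_{\tau+1}, \ldots, -h_q \\ 0 & \text{else} \end{cases} \tag{4} $$

The field $\vec u(\vec h^1)$ has at most one non-zero component, which takes the value $-1$, i.e.
$\vec u(\vec h^1) \in \{0, -\vec e_1, \ldots, -\vec e_q\}$ with $\vec e_\tau$ denoting a unit vector
in direction $\tau$.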
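The single-neighbor cavity step can be checked numerically. The sketch below reads Eq. (4) as $\omega(\vec h) = -\min(-h_1, \ldots, -h_q)$ and $u_\tau(\vec h) = -1$ when $h_\tau$ is the unique largest component (0 otherwise), and compares the closed form of Eq. (3) with a direct minimization over the neighbor's color. Colors are indexed from 0 here, fields are assumed to be non-positive integers, and all function names are hypothetical.

```python
from itertools import product

def omega(h):
    """omega(h) = -min(-h_tau), i.e. the largest field component."""
    return max(h)

def u(h):
    """Propagated field: -e_tau if h_tau is the unique maximum, else zero."""
    top = max(h)
    winners = [t for t, x in enumerate(h) if x == top]
    out = [0] * len(h)
    if len(winners) == 1:
        out[winners[0]] = -1
    return out

def energy_after_adding_spin(h, sigma0, A=0):
    """Direct minimization over the color sigma_1 of the old spin."""
    return min(A - h[s1] + (1 if s1 == sigma0 else 0) for s1 in range(len(h)))

# Compare both sides of the cavity step for q = 3 and small integer fields.
fields = list(product([-2, -1, 0], repeat=3))
consistent = all(
    energy_after_adding_spin(h, s0) == -omega(h) - u(h)[s0]
    for h in fields for s0 in range(3)
)
```

The check reflects the physical picture: the new spin pays one unit of energy only if it takes the single color that its neighbor strictly prefers.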

If the new spin $\sigma_0$ is connected to $d$ sites with fields $\vec h^1, \ldots, \vec h^d$, and
if these spins were previously uncorrelated (which is the case inside one pure state, cf. the
discussion in [?]), the propagated fields can be linearly superposed,
$\vec h^0 = \sum_{i=1}^{d} \vec u(\vec h^i)$. Note that the fields never become positive, which
reflects the anti-ferromagnetic character of the model. Colors are suppressed by neighbors carrying
this color; they can be favored only by suppressing all other colors. If there were a single state
(replica symmetry), every link $(i,j)$ would carry two propagated fields $\vec u_{i \to j}$ and
$\vec u_{j \to i}$ which are determined self-consistently. In the case of multiple states, these
fields fluctuate from state to state and have to be characterized by a full distribution
$Q_{i \to j}(\vec u)$, cf. [?]. Due to the global color symmetry, each of these takes the form

$$ Q_{i \to j}(\vec u) = (1 - q \eta_{i \to j}) \, \delta(\vec u) + \eta_{i \to j} \sum_{\tau=1}^{q} \delta(\vec u + \vec e_\tau) \tag{5} $$

and can thus be fully described by the probability $\eta_{i \to j}$ that a given color of vertex
$j$ is forbidden by edge $(i,j)$. Denoting the histogram of all $\eta_{i \to j}$ by $\rho(\eta)$,
the self-consistency equation for the distribution of the $Q_{i \to j}(\vec u)$ can be reduced to a
simple equation for $\rho(\eta)$:



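$$ \rho(\eta) = e^{-c} \sum_{d=0}^{\infty} \frac{c^d}{d!} \int d\eta_1 \cdots d\eta_d \; \rho(\eta_1) \cdots \rho(\eta_d) \; \delta\big(\eta - f_d(\eta_1, \ldots, \eta_d)\big) \tag{6} $$

where $f_d$ is simply given by

$$ f_d(\eta_1, \ldots, \eta_d) = \frac{\sum_{l=0}^{q-1} (-1)^l \binom{q-1}{l} \prod_{i=1}^{d} [1 - (l+1)\eta_i]}{\sum_{l=0}^{q-1} (-1)^l \binom{q}{l+1} \prod_{i=1}^{d} [1 - (l+1)\eta_i]} \tag{7} $$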
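A direct Python transcription of $f_d$ may help to see what it computes. The sketch reads Eq. (7) as a ratio of two inclusion-exclusion sums: the numerator is the probability that one given color is the only color left free by the $d$ neighbors, the denominator is the probability that at least one color is left free. This reading, the function name and the test values are assumptions made for illustration.

```python
from math import comb, prod

def f_d(etas, q):
    """Ratio of inclusion-exclusion sums over l = 0, ..., q-1."""
    def term(l):
        # Probability that a fixed set of l+1 colors is forbidden by no neighbor.
        return prod(1.0 - (l + 1) * eta for eta in etas)
    num = sum((-1) ** l * comb(q - 1, l) * term(l) for l in range(q))
    den = sum((-1) ** l * comb(q, l + 1) * term(l) for l in range(q))
    return num / den

# Two neighbors, each frozen to a uniformly random one of q = 3 colors.
two_frozen = f_d([1 / 3, 1 / 3], 3)
```

For two fully frozen neighbors and three colors, a given color is forced exactly when the neighbors occupy the two other colors in either order, which happens with probability $2/9$. With fewer than $q-1$ non-zero inputs the numerator vanishes, so no color can be forced.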

This equation resembles a replica-symmetric self-consistent equation and can be solved numerically
using a population dynamical algorithm: we start with an initial population
$\eta_1, \ldots, \eta_{\mathcal{N}}$ of size $\mathcal{N}$, which can easily be chosen as large as
$10^6$ to generate high-precision data. This population is updated by iterating the following steps
until convergence: (I) randomly draw a number $d$ from the Poisson distribution $e^{-c} c^d / d!$;
(II) randomly select $d+1$ indices $i_0, i_1, \ldots, i_d$ from $\{1, \ldots, \mathcal{N}\}$;
(III) update the population by replacing $\eta_{i_0}$ by $f_d(\eta_{i_1}, \ldots, \eta_{i_d})$.
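Steps (I) to (III) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used for the results of this paper: the update function is written as a ratio of inclusion-exclusion sums over the colors, the initial population is drawn uniformly from $[0, 1/q)$, the population is far smaller than $10^6$, and a fixed number of sweeps replaces a convergence test. All names and parameter values are hypothetical.

```python
import math
import random

def f_d(etas, q):
    """Update function, written as a ratio of inclusion-exclusion sums."""
    def term(l):
        return math.prod(1.0 - (l + 1) * eta for eta in etas)
    num = sum((-1) ** l * math.comb(q - 1, l) * term(l) for l in range(q))
    den = sum((-1) ** l * math.comb(q, l + 1) * term(l) for l in range(q))
    return num / den

def sample_poisson(c, rng):
    """Knuth's multiplication method; adequate for moderate c."""
    limit, k, p = math.exp(-c), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def population_dynamics(q, c, size=500, sweeps=40, seed=0):
    rng = random.Random(seed)
    pop = [rng.random() / q for _ in range(size)]        # assumed initial condition
    for _ in range(sweeps * size):
        d = sample_poisson(c, rng)                        # step (I)
        idx = [rng.randrange(size) for _ in range(d + 1)] # step (II)
        new = f_d([pop[i] for i in idx[1:]], q)           # step (III)
        pop[idx[0]] = min(max(new, 0.0), 1.0 / q)         # clamp round-off
    return pop

# Well below the clustering transition only the trivial solution should survive.
pop_low = population_dynamics(q=3, c=2.0)
```

At low connectivity the whole population collapses onto $\eta = 0$. Locating the point where a non-trivial population first survives requires much larger populations and more sweeps than used in this sketch.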

One obvious solution of Eq. (6) is the paramagnetic solution $\delta(\eta)$. For small average
connectivities $c$ it is even the only one. The appearance of a non-trivial solution coincides with
a clustering transition of ground states into an exponentially large number of extensively
separated clusters. In spin-glass theory, this transition is called dynamical. Even then,
$\rho(\eta)$ will contain a non-trivial peak at $\eta = 0$ due to small disconnected subgraphs,
dangling ends etc. The weight $t$ of this peak can be computed self-consistently from

$$ t = e^{-(1-t)c} \sum_{l=0}^{q-2} \frac{(1-t)^l c^l}{l!} \tag{8} $$



This equation is quite interesting, since a non-trivial solution of it forms a necessary condition
for Eq. (6) to have a non-trivial solution. In fact [?], the fraction of edges in the $q$-core is
given by $(1 - t_{\min})$, with $t_{\min}$ being the smallest positive solution of Eq. (8). Thus, we
also find that the existence of an extensive $q$-core is necessary for a non-trivial $\rho(\eta)$,
and forms a lower bound for the $q$-COL/UNCOL transition.
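The smallest positive solution of the self-consistency condition for $t$ can be found by simple iteration. The sketch below reads Eq. (8) as $t = e^{-(1-t)c} \sum_{l=0}^{q-2} [(1-t)c]^l / l!$ and iterates from $t = 0$; since the right-hand side is increasing in $t$, the iterates grow monotonically towards the smallest fixed point. The function names, the tolerance and the test connectivities are illustrative assumptions.

```python
import math

def rhs(t, q, c):
    """Right-hand side of the self-consistency condition for the peak weight t."""
    x = (1.0 - t) * c
    return math.exp(-x) * sum(x ** l / math.factorial(l) for l in range(q - 1))

def t_min(q, c, tol=1e-12, max_iter=100000):
    """Smallest fixed point, reached by monotone iteration from t = 0."""
    t = 0.0
    for _ in range(max_iter):
        t_new = rhs(t, q, c)
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t

# q = 3: one connectivity below and two above the emergence of the core.
below = t_min(3, 3.3)
above = t_min(3, 3.4)
well_above = t_min(3, 4.0)
```

For $q = 3$ the iteration returns $t = 1$ at $c = 3.3$ but a solution $t < 1$ at $c = 3.4$, in line with the discontinuous emergence of the 3-core at $c \simeq 3.35$ quoted earlier.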

Unlike in the case of finite-connectivity $p$-spin-glasses or, equivalently, random XOR-SAT
problems [22, 23, 24], the existence of a solution $t < 1$ is not sufficient for a non-trivial
$\rho(\eta)$ to exist. The latter appears suddenly at the dynamical transition $c_d$, which can be
determined to high precision using the population dynamical algorithm. This solution does not imply
uncolorability, but the set of solutions is separated into an exponentially large number of
clusters. The logarithm of their number, divided by the graph size $N$, is called the complexity
$\Sigma(c)$ and can be calculated from $\rho(\eta)$, cf. [?]:

$$ \Sigma(c) = e^{-c} \sum_{d=1}^{\infty} \frac{c^d}{d!} \int d\eta_1 \cdots d\eta_d \; \rho(\eta_1) \cdots \rho(\eta_d) \,
   \ln\Big( \sum_{l=0}^{q-1} (-1)^l \binom{q}{l+1} \prod_{i=1}^{d} [1 - (l+1)\eta_i] \Big)
   - \frac{c}{2} \int d\eta_1 \, d\eta_2 \; \rho(\eta_1) \rho(\eta_2) \ln(1 - q \eta_1 \eta_2) \tag{9} $$



The full derivation will be given in [?]. At the dynamical threshold, this complexity starts
discontinuously with a positive value, see Fig. 1, and decreases when $c$ is increased. The static
RSB transition, and thus the $q$-COL/UNCOL threshold $c_q$, is given by the point of vanishing
complexity. At this point the number of clusters becomes sub-exponential, and the clusters
disappear beyond $c_q$.

FIG. 1: Top: Complexity $\Sigma(c)$ vs. average connectivity for $q = 3$ and $q = 4$. Non-zero
complexity appears discontinuously at the dynamical threshold $c_d$, and goes down continuously to
zero at the $q$-COL/UNCOL transition. The curves are calculated using the population-dynamical
solution for $\rho(\eta)$ with population size $\mathcal{N} = 10^6$.

Bottom: The full line shows the chromatic number of large random graphs vs. their connectivity $c$.
The symbols give results of smallk for $N = 10^3$, each averaged over 100 samples.

In the following table, we present the results for $q = 3$, 4, and 5. For the dynamical transition
we show the corresponding values of $c_d$, of the entropy
$s(c_d) = \ln q + \frac{c_d}{2} \ln \frac{q-1}{q}$, and of the complexity $\Sigma(c_d)$. For the
$q$-COL/UNCOL transition, the critical connectivity $c_q$ and the solution entropy $s(c_q)$ are
given. Like in random 3-satisfiability [26] and vertex covering [27], this entropy is found to be
finite at the transition point.

  q    c_d      s(c_d)    Σ(c_d)    c_q      s(c_q)
  3    4.42     0.203     0.0223    4.69     0.148
  4    8.27     0.197     0.0553    8.90     0.106
  5    12.67    0.196     0.0794    13.69    0.082
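The entropy values in the table can be cross-checked with a few lines of Python, reading the entropy formula as $s(c) = \ln q + \frac{c}{2} \ln(1 - 1/q)$. Applying the same expression at $c_q$ is an assumption of this sketch; it turns out to reproduce the last column as well. The dictionary simply restates the tabulated numbers.

```python
import math

def entropy(q, c):
    """s(c) = ln q + (c/2) ln(1 - 1/q)."""
    return math.log(q) + 0.5 * c * math.log(1.0 - 1.0 / q)

# q: (c_d, s(c_d), c_q, s(c_q)) as tabulated.
table = {
    3: (4.42, 0.203, 4.69, 0.148),
    4: (8.27, 0.197, 8.90, 0.106),
    5: (12.67, 0.196, 13.69, 0.082),
}

deviation = {
    q: (abs(entropy(q, c_d) - s_d), abs(entropy(q, c_q) - s_q))
    for q, (c_d, s_d, c_q, s_q) in table.items()
}
```

All deviations stay below the rounding of the tabulated values, i.e. below $10^{-3}$.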



One can see that the complexity at the dynamical threshold grows strongly with $q$, whereas the
total entropy decreases slowly. This means that the clustering phenomenon becomes more and more
pronounced: the number of clusters increases, while their internal entropy $s(c) - \Sigma(c)$
becomes smaller. It also becomes more relevant for small systems. At $N = 100$ and the dynamical
threshold, we would predict only around 10 clusters for $q = 3$; for $q = 4$ this number would
already be close to 250, and it would grow to about 2800 for 5 colors.
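These estimates follow from the definition of the complexity: the number of clusters is $e^{N \Sigma}$. A short Python check using the tabulated complexities at the dynamical threshold (variable names are illustrative):

```python
import math

# Complexity at the dynamical threshold for q = 3, 4, 5, as tabulated.
sigma_d = {3: 0.0223, 4: 0.0553, 5: 0.0794}
N = 100

clusters = {q: math.exp(N * sigma) for q, sigma in sigma_d.items()}
```

The three values come out near 9, 250 and 2800, respectively.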

The dynamical transition is not only characterized by a sudden clustering of ground states; at the
same point an exponential number of meta-stable states of positive energy appears [20]. Such states
are expected to act as traps for local search algorithms, causing an exponential slowing down of
the search process. Well-known examples of search processes that are overwhelmed by the presence of
excited states are simulated annealing and greedy algorithms based on local information.

To test this prediction, we have applied several of the best solvers for COL and SAT problems
available on the net [?]. The best results could be obtained using the complete smallk program
[28], which may need exponential time to find a proper minimal coloring. Using a cutoff time (we
probed with 10 seconds, 1 minute and 2 minutes without substantial changes for $N = 10^3$), the
algorithm can be restricted to sub-exponential times, i.e. only the underlying polynomial-time
heuristic is applied. The results in Fig. 1 were obtained in the following way: we first tried to
color a random graph ($N = 10^3$) with a small number of colors (here $q = 3$). If, after the
cutoff time, smallk did not find a coloring, we stopped and retried with larger $q$. For each
connectivity we averaged over 100 samples. As can be clearly seen, the algorithm fails with $q$
colors slightly below the dynamical transition, confirming our expectations. Only a perfect local
heuristic should reach this threshold.
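The protocol described above (try $q$ colors, give up after a cutoff, retry with $q+1$) can be sketched in Python. The solver below is a plain backtracking search with a budget on visited nodes; it is a hypothetical stand-in and not the smallk program, and the node budget replaces the wall-clock cutoff. All names and default values are assumptions.

```python
def try_coloring(adj, q, budget=100000):
    """Backtracking search for a proper q-coloring; None if the budget runs out."""
    n = len(adj)
    order = sorted(range(n), key=lambda v: -len(adj[v]))  # high degree first
    color = [None] * n
    visits = 0

    def extend(k):
        nonlocal visits
        if k == n:
            return True
        visits += 1
        if visits > budget:
            return False                                   # cutoff reached
        v = order[k]
        used = {color[u] for u in adj[v] if color[u] is not None}
        for col in range(q):
            if col not in used:
                color[v] = col
                if extend(k + 1):
                    return True
                color[v] = None
        return False

    return color if extend(0) else None

def smallest_q_found(adj, q_start=3, budget=100000):
    """Increase q until the budgeted search succeeds."""
    for q in range(q_start, len(adj) + 1):
        if try_coloring(adj, q, budget) is not None:
            return q
    return None

# Hypothetical adjacency lists: complete graph on 4 vertices and a 5-cycle.
k4 = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
cycle5 = [[1, 4], [0, 2], [1, 3], [2, 4], [3, 0]]
```

With a finite budget the returned value is only an upper bound on the chromatic number. The recursion depth equals the number of vertices, so graphs with $N = 10^3$ would need an iterative variant or a raised recursion limit.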

We conclude by noticing that, similarly to the 3-SAT problem [?], we expect the assumptions
underlying the cavity approach to hold for single instances of COL. The equations for the order
parameter on single instances provide the full histogram of the $N$ probability distributions of
the effective fields, one for each variable, which describe the fluctuations of the polarization of
each Potts variable in the ground states. On the physics side, this information allows one to
develop a single-sample statistical mechanics analysis, whereas on the algorithmic side it allows
one to develop new algorithms [25].

We are grateful to A. Braunstein, J. Culberson and F. Ricci-Tersenghi for interesting discussions.